MambaVision is the first hybrid computer vision model combining the strengths of Mamba and Transformer. It enhances visual feature modeling by reconstructing the Mamba formulation and incorporates self-attention modules in the final layers of the Mamba architecture to improve long-range spatial dependency modeling.
Image Classification
Transformers